Sinhala Grapheme-to-Phoneme Conversion and Rules for Schwa Epenthesis
Authors
Abstract
This paper describes an architecture for converting Sinhala Unicode text into a phonemic specification of pronunciation. The study focuses mainly on disambiguating schwa /ə/ and /a/ vowel epenthesis for consonants, which is one of the significant problems found in Sinhala. The problem is addressed by formulating a set of rules. The proposed rules were tested on 30,000 distinct words obtained from a corpus, and the output was compared with phonemic transcriptions of the same words produced manually by an expert. The Grapheme-to-Phoneme (G2P) conversion model achieves 98% accuracy.
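The evaluation described in the abstract, comparing rule output against expert transcriptions word by word, can be illustrated with a minimal sketch. It assumes exact-match scoring per word, which the abstract does not spell out. The function name `word_accuracy`, the `g2p` callable, and the toy data are hypothetical placeholders, not the authors' code or data.

```python
from typing import Callable, Dict


def word_accuracy(g2p: Callable[[str], str], gold: Dict[str, str]) -> float:
    """Return the fraction of words whose predicted transcription
    exactly matches the expert (gold) transcription.

    Assumption: a word counts as correct only on an exact string match.
    """
    if not gold:
        raise ValueError("gold lexicon is empty")
    correct = sum(1 for word, phonemes in gold.items() if g2p(word) == phonemes)
    return correct / len(gold)


# Toy placeholder data (not real Sinhala words or transcriptions).
toy_gold = {"w1": "p1", "w2": "p2", "w3": "p3", "w4": "p4"}
toy_predictions = {"w1": "p1", "w2": "p2", "w3": "p3", "w4": "xx"}

# Stand-in for a rule-based converter: here it is just a table lookup.
accuracy = word_accuracy(lambda word: toy_predictions[word], toy_gold)
print(f"{accuracy:.0%}")  # 75%
```

In a real setup, `g2p` would be the rule-based converter and `gold` the manually transcribed word list; the reported figure would then be this ratio over all distinct test words.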
Similar resources
Grapheme-to-Phoneme Conversion for Amharic Text-to-Speech System
Developing a correct Grapheme-to-Phoneme (GTP) conversion method is a central problem in text-to-speech synthesis. In particular, deriving phonological features that are not shown in the orthography is challenging. In the Amharic language, geminates and epenthetic vowels are crucial for proper pronunciation, but neither is shown in the orthography. This paper describes an architecture, a preprocessing...
Dialect variation in Boro Language and Grapheme-to-Phoneme conversion rules to handle lexical lookup fails in Boro TTS System
It is not possible to include all the words of a natural language in a general text-to-speech system. A grapheme-to-phoneme conversion system is essential for pronouncing out-of-vocabulary words. Grapheme-to-phoneme rules play a vital role where lexical lookup fails. Though a basic grapheme-to-phoneme rule system is very simple, it is very powerful for the naturalness of a TTS system. Letter-to...
Decision Tree Learning for Automatic Grapheme to Phoneme Conversion for Tamil. N. Udhyakumar, C. S. Kumar, R. Srinivasan and R. Swaminathan
This paper describes a novel approach to grapheme-to-phoneme conversion using a decision tree learning technique. Unlike the rule-based approach, the proposed approach can generate rules spanning a wider context and thus give better conversion accuracy.
Phonological variation: epenthesis and deletion of schwa in Dutch
Two types of phonological variation in Dutch, resulting from optional rules, are schwa epenthesis and schwa deletion. In a lexical decision experiment it was investigated whether the phonological variants were processed similarly to the standard forms. It was found that the two types of variation patterned differently. Words with schwa epenthesis were processed faster and more accurately than t...
A Diachronic Approach for Schwa Deletion in Indo Aryan Languages
Schwa deletion is an important issue in grapheme-to-phoneme conversion for Indo-Aryan languages (IAL). In this paper, we describe a syllable-minimization-based algorithm for dealing with this that outperforms the existing methods in terms of efficiency and accuracy. The algorithm is motivated by the fact that deletion of schwa is a diachronic and sociolinguistic phenomenon that facilitates faste...